RNA Secondary Structure Prediction Using Hierarchical Folding
Author
Abstract
Algorithms for prediction of RNA secondary structure (the set of base pairs that form when an RNA molecule folds) are valuable to biologists who aim to understand RNA structure and function. Improving the accuracy and efficiency of prediction methods is an ongoing challenge, particularly for pseudoknotted secondary structures, in which base pairs overlap. This challenge is biologically important, since pseudoknotted structures play essential roles in the functions of many RNA molecules, such as splicing and ribosomal frameshifting. State-of-the-art methods, which are based on free energy minimization, have high run-time complexity (typically Θ(n⁵) or worse) and can handle (minimize over) only limited types of pseudoknotted structures.

We analyze a new approach for prediction of pseudoknotted structures, motivated by the hypothesis that RNA structures fold hierarchically, with pseudoknot-free (non-overlapping) base pairs forming first and pseudoknots forming later so as to minimize energy relative to the folded pseudoknot-free structure. Our HFold algorithm, based on work of S. Zhao, uses two-phase energy minimization to predict hierarchically formed secondary structures in O(n³) time, matching the complexity of the best algorithms for pseudoknot-free secondary structure prediction via energy minimization. Our algorithm can handle a wide range of biological structures, including kissing hairpins and nested kissing hairpins, which have previously required Θ(n⁶) time.

We also report on the experimental evaluation of HFold and present thorough analyses of the results. We show that if the input structure to the algorithm is correct, running the algorithm yields a 16% accuracy improvement on average over the accuracy of the true pseudoknot-free structures.
However, if the input structure is not correct, the accuracy improvement is not significant. If the first 10 suboptimal foldings are given as input to our algorithm instead of just the minimum free energy (MFE) structure, the prediction accuracy improves significantly over the accuracy of the MFE structures. This improvement grows as the number of suboptimal foldings given as input increases. A comparison of the energy of the structures that HFold predicts from the true pseudoknot-free structures with the energy of the true structures, calculated using a different method with the same energy model, shows that the energy model may be responsible for the cases in which HFold predicts structures far from the true structures. Our experimental results suggest some ways in which the hierarchical folding hypothesis might need to be refined.
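The abstract's distinction between pseudoknot-free and pseudoknotted structures can be made concrete with a short check. The sketch below is illustrative only and is not part of HFold: it assumes a structure is given as a list of base pairs (i, j) of 1-based sequence positions, and it treats two pairs as overlapping when exactly one end of the second pair lies strictly between the ends of the first. The function names `crosses` and `is_pseudoknotted` and the two example structures are hypothetical.

```python
from itertools import combinations


def crosses(p, q):
    """Return True if base pairs p and q overlap without being nested."""
    # Normalise each pair so that i < j, then order the pairs by left end.
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    # Crossing pattern: i < k < j < l (q starts inside p but ends outside it).
    return i < k < j < l


def is_pseudoknotted(pairs):
    """Return True if any two base pairs in the structure cross."""
    return any(crosses(p, q) for p, q in combinations(pairs, 2))


# Hypothetical example structures (positions are illustrative only).
hairpin = [(1, 10), (2, 9), (3, 8)]            # fully nested pairs
h_type = [(1, 12), (2, 11), (7, 18), (8, 17)]  # two interleaved stems

print(is_pseudoknotted(hairpin))
print(is_pseudoknotted(h_type))
```

Under these assumptions, a hierarchical method would take a structure like `hairpin` as its pseudoknot-free first phase and consider adding pairs that cross it in the second phase.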
Similar resources
Characteristics and Prediction of RNA Structure
RNA secondary structures with pseudoknots are often predicted by minimizing free energy, which is NP-hard. Most RNAs fold during transcription from DNA into RNA through a hierarchical pathway wherein secondary structures form prior to tertiary structures. Real RNA secondary structures often have local instead of global optimization because of kinetic reasons. The performance of RNA structure pr...
Secondary Structure Prediction for Circular RNAs
RNAs play an important role in bioinformatic applications. Their ability to serve not only as information carrier, but also to develop catalytic properties highlights them in the set of organic macromolecules notably. As these catalytic properties are closely related to the three-dimensional configuration (tertiary structure) of the RNA molecule, the formation and prediction of this tertiary st...
Ab initio RNA folding by discrete molecular dynamics: from structure prediction to folding mechanisms.
RNA molecules with novel functions have revived interest in the accurate prediction of RNA three-dimensional (3D) structure and folding dynamics. However, existing methods are inefficient in automated 3D structure prediction. Here, we report a robust computational approach for rapid folding of RNA molecules. We develop a simplified RNA model for discrete molecular dynamics (DMD) simulations, in...
A folding algorithm for extended RNA secondary structures
MOTIVATION: RNA secondary structure contains many non-canonical base pairs of different pair families. Successful prediction of these structural features leads to improved secondary structures with applications in tertiary structure prediction and simultaneous folding and alignment. RESULTS: We present a theoretical model capturing both RNA pair families and extended secondary structure motifs ...
Relation Between RNA Sequences, Structures, and Shapes via Variation Networks
Background: RNA plays a key role in many aspects of biological processes, and its tertiary structure is critical for its biological function. RNA secondary structure represents various significant portions of RNA tertiary structure. Since the biological function of RNA is concluded indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...
Prediction of RNA Pseudoknots Using Heuristic Modeling with Mapping and Sequential Folding
Predicting RNA secondary structure is often the first step to determining the structure of RNA. Prediction approaches have historically avoided searching for pseudoknots because of the extreme combinatorial and time complexity of the problem. Yet neglecting pseudoknots limits the utility of such approaches. Here, an algorithm utilizing structure mapping and thermodynamics is introduced for RNA ...